The world is not yet flat: Transport costs matter!

(c) Kristian Behrens, W. Mark Brown, and Théophile Bougna

—

Important information on compiling and running the C codes.


The GNU Scientific Library (GSL) has to be linked (in particular libgsl.a). It is available at https://www.gnu.org/software/gsl/. We used version 1.16 of the library.

All header search paths and library search paths have to be fully specified in the build options. The code was developed on a Mac Pro and a MacBook Pro running macOS 10.12 and 10.13, using the macOS 10.12 SDK. It was also tested on older SDKs. Please make sure that you use the correct SDK: an incorrectly specified SDK may cause segmentation faults.

For the implementation at Statistics Canada, the code was compiled on a Microsoft server using Visual C++.


The command line structure for running the executable is as follows:

./[EXEC_NAME] ./industry_list.txt ./location_list.txt ./[OUTPUT_PATH]/ cut_dist replics weighted fast_approx 0 CDF_proc 

—

EXEC_NAME		= name of the executable you produced

industry_list.txt	= file that contains the list of samples to be processed (see example; the first line contains the number of files to be processed, each subsequent line is an input filename)

location_list.txt	= file that contains the list of sampling spaces to be used (see example; each line is a filename containing the counterfactuals that are associated with the corresponding sample in industry_list.txt)

OUTPUT_PATH		= directory where the output is written (there are two files: ‘_emp’ contains the empirical K-density, and ‘_mcs’ contains the counterfactual bounds). Please include the trailing ‘/’.

cut_dist		= maximum distance (in km) over which the K-densities are computed (we used 800 km in the paper)

replics			= number of sampling replications used to compute the confidence bands (we used 1000)

weighted		= 0 for the unweighted case, and 1 if weights are used

fast_approx		= 0 if the exact Gaussian density is used, and 1 if a precomputed table of that density is used; this should always be set to 1, as it is MUCH faster. For small samples (<30), the code forces the use of the exact procedure, but this should be a rare exception.

0			= fixed argument; always leave this set to 0

CDF_proc		= 0 (default) or 1 if the unsmoothed CDF is to be computed
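
To illustrate how the nine positional arguments above fit together, here is a minimal sketch that reads and checks them. It is not taken from the distributed source: the struct kd_args, the function kd_parse_args, and the specific checks (exactly nine arguments, a trailing ‘/’ on the output path, flags restricted to 0 or 1, the fixed argument equal to 0) are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the positional arguments. Field names follow
   the parameter list in this README, not the distributed source code. */
typedef struct {
    const char *industry_list;  /* file listing the samples */
    const char *location_list;  /* file listing the sampling spaces */
    const char *output_path;    /* output directory, ending in '/' */
    double cut_dist;            /* maximum distance in km */
    long replics;               /* number of sampling replications */
    int weighted;               /* 0 = unweighted, 1 = weighted */
    int fast_approx;            /* 0 = exact density, 1 = precomputed table */
    int cdf_proc;               /* 0 = default, 1 = unsmoothed CDF */
} kd_args;

/* Parse a single-character 0/1 flag; returns 0 on success, -1 otherwise. */
static int parse_flag(const char *s, int *out)
{
    if ((s[0] != '0' && s[0] != '1') || s[1] != '\0') return -1;
    *out = s[0] - '0';
    return 0;
}

/* Fill *a from argv; returns 0 on success, -1 on any malformed argument. */
int kd_parse_args(int argc, char **argv, kd_args *a)
{
    char *end;
    size_t len;
    int reserved;

    assert(a != NULL);
    if (argc != 10) return -1;  /* program name + 9 arguments */

    a->industry_list = argv[1];
    a->location_list = argv[2];
    a->output_path = argv[3];
    len = strlen(argv[3]);
    if (len == 0 || argv[3][len - 1] != '/') return -1;

    a->cut_dist = strtod(argv[4], &end);
    if (end == argv[4] || *end != '\0' || a->cut_dist <= 0.0) return -1;

    a->replics = strtol(argv[5], &end, 10);
    if (end == argv[5] || *end != '\0' || a->replics <= 0) return -1;

    if (parse_flag(argv[6], &a->weighted) != 0) return -1;
    if (parse_flag(argv[7], &a->fast_approx) != 0) return -1;
    /* Assumption: the fixed argument must be exactly 0. */
    if (parse_flag(argv[8], &reserved) != 0 || reserved != 0) return -1;
    if (parse_flag(argv[9], &a->cdf_proc) != 0) return -1;
    return 0;
}
```

A caller would pass argc and argv from main() and stop with a usage message if kd_parse_args returns -1.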

—

Here is a sample test and output (using only 10 replications for illustrative purposes):


C33175:kdense behrens_k$ ./kdense2 ./industry_list.txt ./location_list.txt ./output/ 800 10 0 1 0 0

./kdense2: Number of industries to be processed = 1
./kdense2: Not using weights in computations
./kdense2: Using fast interpolation approximation of gsl_sf_erf_Z
./kdense2: Running standard K-density computions
./kdense2: Running standard K-density computions
./kdense2: START
./kdense2: Using location universe ./location_universe_test.txt
./kdense2: Size of sampling location universe = 35336
./kdense2: Running 10 sampling replications, cutoff distance = 800
./kdense2: Processing industry naics_sample_test with 250 firms
./kdense2: (p25+, p75+) industry distance = (-1266.6064, 1266.6064)
./kdense2: k-density optimal bandwidth set to 220.3154
./kdense2: Starting sampling (progress display step = 1 replics)
./kdense2:  0% done
./kdense2: 10% done
./kdense2: 20% done
./kdense2: 30% done
./kdense2: 40% done
./kdense2: 50% done
./kdense2: 60% done
./kdense2: 70% done
./kdense2: 80% done
./kdense2: 90% done
./kdense2: Sampling output written to ./output/naics_sample_test_kdense_mcs.txt
./kdense2: Total time for processing naics_sample_test = 6 seconds
./kdense2: END


—

File structures:

The industry files have the following structure:

First line: number of observations, followed by 0 0
Other lines: latitude longitude weight [e.g., employment]

The sampling files have the following structure:

First line: number of locations, followed by 0
Other lines: latitude longitude


In each file, the count given on the first line must match the number of data lines that follow; otherwise, segmentation faults will occur. Sample files are provided with the code for illustrative purposes. We also provide a precompiled executable for macOS 10.13 (kdense_exec). Please run it as a command-line tool in the terminal.
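
Because a mismatch between the first-line count and the actual number of data lines leads to a crash, it can help to check input files before running the executable. The following is a minimal sketch of such a check, not part of the distributed code: the function name kd_check_counts, the 512-character line buffer, and the choice to ignore blank lines are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical pre-flight check (not part of the distributed code).
   ncols = 3 for industry files (latitude longitude weight),
   ncols = 2 for sampling files (latitude longitude).
   Returns 0 if the first-line count equals the number of data lines that
   carry at least ncols numeric fields, and -1 otherwise.
   Assumption: no line is longer than 511 characters. */
int kd_check_counts(const char *path, int ncols)
{
    char line[512];
    long declared = 0, rows = 0;
    int ok = 1;
    FILE *fp;

    assert(ncols == 2 || ncols == 3);
    fp = fopen(path, "r");
    if (fp == NULL) return -1;

    /* The first line starts with the declared number of observations. */
    if (fgets(line, sizeof line, fp) == NULL ||
        sscanf(line, "%ld", &declared) != 1 || declared < 0) {
        fclose(fp);
        return -1;
    }

    while (ok && fgets(line, sizeof line, fp) != NULL) {
        double v[3];
        int n = sscanf(line, "%lf %lf %lf", &v[0], &v[1], &v[2]);
        if (n == EOF) continue;   /* blank line: ignored (assumption) */
        if (n < ncols) ok = 0;    /* too few numeric fields on this line */
        else rows++;
    }
    fclose(fp);
    return (ok && rows == declared) ? 0 : -1;
}
```

For example, kd_check_counts("./location_universe_test.txt", 2) would check the sampling file used in the test run above; a return value of -1 signals that the file should be fixed before it is passed to the executable.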
